Testing SDRT's Right Frontier 



Stergos D. Afantenos and Nicholas Asher 

Institut de recherche en informatique de Toulouse (IRIT), 
CNRS, Université Paul Sabatier

{stergos.afantenos, nicholas.asher}@irit.fr






Abstract 

The Right Frontier Constraint (RFC), as a 
constraint on the attachment of new con- 
stituents to an existing discourse struc- 
ture, has important implications for the in-
terpretation of anaphoric elements in dis-
course and for Machine Learning (ML) ap- 
proaches to learning discourse structures. 
In this paper we provide strong empirical 
support for SDRT's version of RFC. The 
analysis of about 100 doubly annotated 
documents by five different naive annota- 
tors shows that SDRT's RFC is respected 
about 95% of the time. The qualitative 
analysis of presumed violations that we 
have performed shows that they are either 
click-errors or structural misconceptions. 

1 Introduction 

A cognitively plausible way to view the construc- 
tion of a discourse structure for a text is an incre- 
mental one. Interpreters integrate discourse con- 
stituent n into the antecedently constructed dis- 
course structure D for constituents 1 to n — 1 by 
linking n to some constituent in D with a dis- 
course relation. SDRT's Right Frontier Constraint 
(RFC) (Asher, 1993; Asher and Lascarides, 2003) 
says that a new constituent n cannot attach to an 
arbitrary node in D. Instead it must attach to ei- 
ther the last node entered into the graph or one of 
the nodes that dominate this last node. Assuming 
that the last node is usually found on the right of 
the structure, this means that the nodes available 
for attachment occur on the right frontier (RF) of 
the discourse graph or SDRS. 

Researchers working in different theoretical 
paradigms have adopted some form of this con- 
straint. Polanyi (1985; 1988) originally pro- 
posed the RFC as a constraint on antecedents to 



anaphoric pronouns. SDRT generalizes this to a 
condition on all anaphoric elements. As the at- 
tachment of new information to a contextually 
given discourse graph in SDRT involves the reso- 
lution of an anaphoric dependency, RFC furnishes 
a constraint on the attachment problem. Webber
(1988) and Mann and Thompson (1987; 1988) have
also adopted versions of this constraint. But there
are important differences. While SDRT and RST 
both take RFC as a constraint on all discourse at- 
tachments (in DLTAG, in contrast, anaphoric dis- 
course particles are not limited to finding an an- 
tecedent on the RF), SDRT's notion of RF is sub- 
stantially different from that of RST's or Polanyi's, 
because SDRT's notion of a RF depends on a 2- 
dimensional discourse graph built from coordinat- 
ing and subordinating discourse relations. Defin- 
ing RFC with respect to SDRT's 2-dimensional 
graphs allows the RF to contain discourse con- 
stituents that do not include the last constituent 
entered into the graph (in contrast to RST). SDRT 
also allows for multiple attachments of a
constituent to the RF.

SDRT's RFC has important implications for the 
interpretation of various types of anaphoric ele- 
ments: tense (Lascarides and Asher, 1993), ellip- 
sis (Hardt et al., 2001; Hardt and Romero, 2004; 
Asher, 2007), as well as pronouns referring to in- 
dividuals and abstract entities (Asher, 1993; Asher 
and Lascarides, 2003). The RFC, we believe, will 
also benefit ML approaches to learning discourse 
structures, as a constraint limiting the search space 
for possible discourse attachments. Despite its 
importance, SDRT's RFC has never been empiri- 
cally validated, however. We present evidence in 
this paper providing strong empirical support for 
SDRT's version of the constraint. We have cho- 
sen to study SDRT's notion of a RF, because of 
SDRT's greater expressive power over RST (Dan- 
los, 2008), the greater generality of SDRT's defi- 



nition of RFC, and because of SDRT's greater the- 
oretical reliance on the constraint for making
semantic predictions. SDRT also makes theoretically
clear why the RFC should apply to discourse re- 
lation attachment, since it treats discourse struc- 
ture construction as a dynamic process in which 
all discourse relations are essentially anaphors. 
The analysis of about 100 doubly annotated docu- 
ments by five different naive annotators shows that 
this constraint, as defined in SDRT, is respected
about 95% of the time. The qualitative analysis of 
the presumed violations that we have performed 
shows that they are either click-errors or structural 
misconceptions by the annotators. 

Below, we give a formal definition of SDRT's 
RFC; section 3 explains our annotation procedure. 
Details of the statistical analysis we have per- 
formed are given in section 4, and a qualitative 
analysis is provided in section 5. Finally, sec- 
tion 6 presents the implications of the empirical 
study for ML techniques for the extraction of dis- 
course structures while sections 7 and 8 present 
the related work and conclusions. 

2 The Right Frontier Constraint in SDRT

In SDRT, a discourse structure or SDRS (Segmented Discourse
Representation Structure) is a tuple $\langle A, \mathcal{F}, \mathit{LAST} \rangle$,
where $A$ is the set of labels representing the discourse constituents of
the structure, $\mathit{LAST} \in A$ the last introduced label
and $\mathcal{F}$ a function which assigns each member of
$A$ a well-formed formula of the SDRS language
(defined in (Asher and Lascarides, 2003, p 138)).
SDRSs correspond to $\lambda$ expressions with a continuation
style semantics. SDRT distinguishes coordinating
and subordinating discourse relations using
a variety of linguistic tests (Asher and Vieu,
2005),¹ and isolates structural relations (Parallel
and Contrast) based on their semantics.

The RF is the set of available attachment points 

¹The subordinating relations of SDRT are currently: Elaboration
(a relation defined in terms of the main eventualities
of the related constituents), Entity-Elaboration (E-Elab(a,b)
iff b says more about an entity mentioned in a that is not the
main eventuality of a), Comment, Flashback (the reverse of
Narration), Background, Goal (intentional explanation),
Explanation, and Attribution. The coordinating relations are:
Narration, Contrast, Result, Parallel, Continuation, Alternation,
and Conditional, all defined in Asher and Lascarides
(2003).



to which a new utterance can be attached. What 
this set includes depends on the discourse relation 
used to make the attachment. Here is the defini- 
tion from (Asher and Lascarides, 2003, p 148). 

Suppose that a constituent $\beta$ is to be attached to a
constituent in the SDRS with a discourse relation
other than Parallel or Contrast. Then the available
attachment points for $\beta$ are:

1. The label $\alpha = \mathit{LAST}$;

2. Any label $\gamma$ such that:

(a) i-outscopes($\gamma$, $\alpha$) (i.e. $R(\delta, \alpha)$ or
$R(\alpha, \delta)$ is a conjunct in $\mathcal{F}(\gamma)$ for
some $R$ and some $\delta$); or

(b) $R(\gamma, \alpha)$ is a conjunct in $\mathcal{F}(\lambda)$ for
some label $\lambda$, where $R$ is a subordinating
discourse relation.

We gloss this as $\alpha < \gamma$.

3. Transitive Closure:

Any label $\gamma$ that dominates $\alpha$ through a
sequence of labels $\gamma_1, \gamma_2, \ldots, \gamma_n$ such that

$\alpha < \gamma_1 < \gamma_2 < \cdots < \gamma_n < \gamma$

We can represent an SDRS as a graph $\mathcal{G}$, whose
nodes are the labels of the SDRS's constituents and
whose typed arcs represent the relations between
them. The nodes available for attachment of a new
element $\beta$ in $\mathcal{G}$ are the last introduced node LAST
and any other node dominating LAST, where the
notion of domination should be understood as the
transitive closure over the arrows given by subordinating
relations or those holding between a
complex segment and its parts. Subordinating relations
like Elaboration extend the vertical dimension
of the graph, whereas coordinating relations
like Narration expand the structure horizontally.
The graph of every SDRS has a unique top label
for the whole structure or formula; however, there
may be multiple $<$ paths defined within a given
SDRS, allowing for multiple parents, in the terminology
of (Wolf and Gibson, 2006). Furthermore,
SDRT allows for multiple arcs between constituents
and attachments to multiple constituents
on the RF, making for a very rich structure.

SDRT's RFC is restricted to non-structural rela- 
tions, because structural relations postulate a par- 
tial isomorphism from the discourse structure of 
the second constituent to the discourse structure 
of the first, which provides its own attachment 
possibilities for subconstituents of the two related 
structures (Asher, 1993). Sometimes such paral- 
lelism or contrast, also known as discourse subor- 
dination (Asher, 1993), can be enforced in a long 



distance way by repeating the same wording in the 
two constituents. 

RFC has the name it does because the segments
that belong to this set (the $\gamma$s in the above definition)
are typically nodes on a discourse graph
which are geometrically placed at the RF of the
graph. Consider the following example embellished
from Asher and Lascarides (2003):

(1) ($\pi_1$) John had a great evening last night. ($\pi_2$) He first
had a great meal at Michel Sarran. ($\pi_3$) He ate
profiterolles de foie gras, ($\pi_4$) which is a specialty of
the chef. ($\pi_5$) He had the lobster, ($\pi_6$) which he had
been dreaming about for weeks. ($\pi_7$) He then went
out to several swank bars.

The graph of the SDRS for (1) looks like this:

(2) [Figure: SDRS graph for (1), with arcs labelled Elaboration, Narration, E-elab and Background.]



where $\pi'$ and $\pi''$ represent complex segments.
Given that the last introduced utterance is represented
by the node $\pi_7$, the set of nodes that are
on the RF are $\pi_7$ (LAST), $\pi'$ (the complex segment
that includes $\pi_7$) and $\pi_1$ (connected via a subordinating
relation to $\pi'$). All those nodes are geometrically
placed at the RF of the graph.
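The definition of the RF can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the authors' implementation: relations are encoded as `(name, source, target)` triples, complex segments as a `members` mapping, and the names `right_frontier`, `SUBORDINATING`, `pi1`, `pi'` and so on are hypothetical. The toy graph is only a fragment of (1), restricted to the nodes the text above discusses, and placing `pi2` inside `pi'` is our reading of the example.

```python
from collections import defaultdict

# Subordinating relation names, following the list in footnote 1
# (the exact spellings used here are an assumption).
SUBORDINATING = {"Elaboration", "E-Elab", "Comment", "Flashback",
                 "Background", "Goal", "Explanation", "Attribution"}


def right_frontier(relations, members, last):
    """Return the set of labels open for attachment.

    relations: iterable of (name, source, target) triples
    members:   dict mapping a complex-segment label to its parts
    last:      the most recently introduced label (LAST)
    """
    parents = defaultdict(set)  # parents[a] holds every g with a < g
    for name, source, target in relations:
        if name in SUBORDINATING:
            parents[target].add(source)   # clause 2(b)
    for segment, parts in members.items():
        for part in parts:
            parents[part].add(segment)    # clause 2(a)

    frontier, stack = {last}, [last]      # clause 1
    while stack:                          # clause 3: transitive closure
        node = stack.pop()
        for parent in parents[node] - frontier:
            frontier.add(parent)
            stack.append(parent)
    return frontier


# Fragment of example (1): pi1 is elaborated by the complex segment pi',
# which contains pi2 and pi7, linked by the coordinating Narration.
relations = [("Elaboration", "pi1", "pi'"), ("Narration", "pi2", "pi7")]
members = {"pi'": {"pi2", "pi7"}}
print(sorted(right_frontier(relations, members, "pi7")))
```

With `pi7` as LAST, the sketch returns the three nodes named in the text; `pi2` is excluded because Narration is coordinating and so contributes no dominance arc.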

SDRT's notion of a RF is more general than 
RST's or DLTAG's. First, SDRSs can have com- 
plex constituents with multiple elements linked 
by coordinate relations that serve as arguments 
to other relations, thus permitting instances of 
shared structure that are difficult to capture in a 
pure tree notation (Lee et al., 2008). In addi- 
tion, in RST the RF picks out the adjacent con- 
stituents, LAST and complex segments including 
LAST. Contrary to RST, SDRT, as it uses 2-dimensional
graphs, predicts that an available attachment
point for $\pi_7$ is the non-local and non-adjacent
$\pi_2$, which is distinct from the complex constituent
consisting of $\pi_2$ to $\pi_6$.² This difference
is crucial to the interpretation of the Narration:



Narration claims a sequence of two events; making
the complex constituent (essentially a sub-SDRS)
an argument of Narration, as RST does,
makes it difficult to recover such an interpretation.
Danlos's (2008) interpretation of the Nuclearity
Principle provides an interpretation of the
Narration([2-4],5) that is equivalent to the SDRS
graph above.³ But even an optional Nuclearity
Principle interpretation won't help with discourse
structures like (2) where the backgrounding material
in $\pi_4$ and the commentary in $\pi_6$ do not and
cannot figure as part of the Elaboration for semantic
reasons. In our corpus described below, over
20% of the attachments were non-adjacent; i.e. the
attachment point for the new material did not include
LAST.

A further difference between SDRT and other
theories is that, as SDRT's RFC is applied recursively
over complex segments within a given
SDRS, many more attachment points are available
in SDRT. E.g., consider the SDRS for this example,
adapted from (Wolf and Gibson, 2006):

(3) ($\pi_1$) Mary wanted garlic and thyme. ($\pi_2$) She also
needed basil. ($\pi_3$) The recipe called for them. ($\pi_4$)
The basil would be hard to come by this time of year.

Because $\pi$ is the complex segment consisting
of $\pi_1$ and $\pi_2$, attachment to $\pi$ with a subordinating
discourse relation permits attachment to $\pi$'s open
constituents as well.⁴

3 Annotated Corpus 

Our corpus comes from the discourse structure annotation
project ANNODIS,⁵ which represents an
ongoing effort to build a discourse graph bank
for French texts with the two-fold goal of test- 
ing various theoretical proposals about discourse 



²The 2-dimensionality of SDRSs also allows us to represent
many examples with Elaboration that involve crossing
dependencies in Wolf and Gibson's (2006) representation
without violation of the RFC.

³Baldridge et al. (2007), however, show that the Nuclearity
Principle does not always hold.

⁴This part of the RFC was not used in (Asher and Lascarides,
2003).

⁵http://w3.erss.univ-tlse2.fr/annodis



structure and providing a seed corpus for learning
discourse structures using ML techniques. ANNODIS's
annotation manual provides detailed instructions
about the segmentation of a text into
Elementary Discourse Units (EDUs). EDUs often
correspond to clauses but are also introduced by
frame adverbials,⁶ appositive elements, correlative
constructions ([the more you work,] [the more
you earn]), interjections and discourse markers
within coordinated VPs: [John denied the charges]
[but then later admitted his guilt]. Appositive elements
often introduce embedded EDUs; e.g., [Jim
Powers, [President of the University of Texas at
Austin], resigned today.], which makes our segmentation
more fine-grained than Wolf and Gibson's
(2006) or annotation schemes for RST or the
PDTB.

The manual also details the meaning of dis- 
course relations but says nothing about the structural
postulates of SDRT. For example, there is no
mention of the RFC in the manual and very little 
about hierarchical structure. Subjects were told 
to put whatever discourse relations from our list 
above between constituents they felt were appro- 
priate. They were also told that they could group 
constituents together whenever they felt that as a 
whole they jointly formed the term of a discourse 
relation. We purposely avoided making the man- 
ual too restrictive, because one of our goals was 
to examine how well SDRT predicts the discourse
structure of subjects who have little knowledge of 
discourse theories. 

In total 5 subjects with little to no knowledge 
of discourse theories that use RFC participated 
in the annotation campaign. Three were under- 
graduate linguistics students and two were grad- 
uate linguistics students studying different areas. 
The 3 undergraduates benefitted from a completed 
and revised annotation manual. The two gradu- 
ate students did their annotations while the anno- 
tation manual was undergoing revisions. All in 
all, our annotators doubly annotated about 100 
French newspaper texts and Wikipedia articles. 
Subjects first segmented each text into EDUs, and 
then they were paired off and compared their seg- 

⁶Frame adverbials are sentence initial adverbial phrases
that can either be temporal, spatial or "topical" (in Chemistry).



mentations, resolving conflicts on their own or via 
a supervisor. The annotation of the discourse re- 
lations was performed by each subject working 
in isolation. ANNODIS provided a new state of 
the art tool, GLOZZ, for discourse annotation for 
the three undergraduates. With GLOZZ annotators 
could isolate sections of text corresponding to several
EDUs, and insert relations between selected
constituents using the mouse. Though it did por- 
tray relations selected as lines between parts of the 
text, GLOZZ did not provide a discourse graph or 
SDRS as part of its graphical interface. The rep- 
resentation often yielded a dense number of lines 
between segments that annotators and evaluators 
found hard to read. The inadequate interline spacing
in GLOZZ also contributed to a certain number
of click errors that we detail below in the paper.
The statistics on the number of documents, EDUs 
and relations provided by each annotator are in ta- 
ble 1. 



annotator      # Docs   # EDUs   # Relations
undergrad 1    27       1342     1216
undergrad 2    31       1378     1302
undergrad 3    31       1376     1173
grad 1         47       1387     1390
grad 2         48       1314     1321

Table 1: Statistics on documents, EDUs and Relations.



4 Experiments and Results 

Using ANNODIS's annotated corpus, we checked
for all EDUs $\pi$, whether $\pi$ was attached to a constituent
in the SDRS built from the previous EDUs
in a way that violated the RFC. Given a discourse
as a series of EDUs $\pi_1, \pi_2, \ldots, \pi_n$, we constructed
for each $\pi_i$ the corresponding sub-graph and calculated
the set of nodes on the RF of this sub-graph.
We then checked whether the EDU $\pi_{i+1}$
was attached to a node that was found in this set.
We also checked whether any newly created complex
segment was attached to a node on the RF of
this sub-graph.
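The incremental check can be sketched as a simple loop. This is an illustration under stated assumptions rather than the procedure actually used: attachments are `(new, target, relation)` triples in textual order, each new EDU becomes LAST, and the frontier computation is passed in as a callable `frontier_of(history, last)`. All names are hypothetical. In the usage example the frontier is replaced by a deliberately naive stand-in that only keeps LAST open.

```python
def rfc_violations(first_label, attachments, frontier_of):
    """Replay attachments in textual order and collect RFC violations.

    first_label: label of the first EDU (the initial one-node graph)
    attachments: list of (new, target, relation) triples
    frontier_of: callable(history, last) -> set of open labels
    """
    history = []        # attachments already integrated into the graph
    last = first_label  # LAST of the sub-graph built so far
    violations = []
    for new, target, relation in attachments:
        open_nodes = frontier_of(history, last)
        if target not in open_nodes:
            violations.append((new, target, relation))
        history.append((new, target, relation))
        last = new      # the newly attached EDU becomes LAST
    return violations


# Stand-in frontier: only LAST is open (an "always attach to LAST" baseline).
only_last = lambda history, last: {last}

attachments = [("pi2", "pi1", "Narration"), ("pi3", "pi1", "Narration")]
print(rfc_violations("pi1", attachments, only_last))
```

Passing the frontier as a parameter keeps the replay loop independent of how dominance is computed, so the same loop works with a full RF computation or with a cheaper approximation.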

4.1 Calculating the Nodes at the RF 

To calculate the nodes on the RF, we slightly extended
the annotated graphs, in order to add implied
relations left out by the annotators.⁷

Disconnected Graphs While checking the RFC 
for the attachment of a node n, the SDRS graph
at this point might consist of 2 or more disjoint 
subgraphs which get connected together at a later 
point. Because we did not want to decide which 
way these graphs should be connected, we defined 
a right frontier for each one using its own LAST. 
We then calculated the RF for each one of them 
and set the set of available nodes to be those in 
the union of the RFs of the disjoint subgraphs. If 
the subgraphs were not connected at the end of 
the incremental process in a way that conformed 
to RFC, we counted this as a violation. Annotators 
did not always provide us with a connected graph. 

Postponed Decisions SDRT allows for the attachment
not only of EDUs but also of subgraphs
to an available node in the contextually given
SDRS. For instance, in the following example, the
intended meaning is given by the graph in which
the Contrast is between the first label and the complex
constituent composed of the disjunction of $\pi_2$
and $\pi_3$.

($\pi_1$) Bill doesn't like sports. ($\pi_2$) But Sam does.
($\pi_3$) Or John does.

[Figure: SDRS graph for this example; the one arc label that survives extraction is Altern.]

Naive annotators attached subgraphs instead of
EDUs to the RF with some regularity (around 2%).
This means that an EDU $\pi_{i+1}$ could be attached to
a node that was not present in the subgraph produced
by $\pi_1, \ldots, \pi_i$. There were two main reasons
for this: (1) $\pi_{i+1}$ came from a syntactically
fronted clause, a parenthetical or apposition in a
sentence whose main clause produced $\pi_{i+2}$ and
$\pi_{i+1}$ was attached to $\pi_{i+2}$; (2) $\pi_{i+1}$ was attached
to a complex segment $[\ldots, \pi_{i+1}, \ldots, \pi_{i+k}, \ldots]$
which was not yet introduced in the subgraph.

Since the nodes to which $\pi_{i+1}$ is attached in
such cases are not present in the graph, by definition
they are not in the RF and they could be
counted as violations. Nonetheless, if the nodes



⁷In similar work on TimeML annotations, Setzer et al.
(2003) and Muller and Raymonet (2005) add implied relations to
annotated temporal graphs.



which connect nodes like $\pi_{i+1}$ eventually link up
to the incrementally built SDRS in the right way,
$\pi_{i+1}$ might eventually end up linked to something
on the RF. For this reason, we postponed the decision
on nodes like $\pi_{i+1}$ until the nodes to which
they are attached were explicitly introduced in the
SDRS.

The Coherence of Complex Segments In an
SDRS, several EDUs may combine to form a complex
segment $\alpha$ that serves as a term for a discourse
relation $R$. The interpretation of the SDRS
implies that all of $\alpha$'s constituents contribute to
the rhetorical function specified by $R$. This implies
that the coordinating relation Continuation
holds between the EDUs inside $\alpha$, unless there is
some other relation between them that is incompatible
with Continuation (like a subordinating
relation). Continuations are often used in SDRT
(Asher, 1993; Asher and Lascarides, 2003). During
the annotation procedure, our subjects did not
always explicitly link the EDUs within a complex
segment. In order to enforce the coherence of
those complex segments we added Continuation
relations between the constituents of a complex
segment unless there was already another path between
those constituents.
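A minimal sketch of this completion step follows. It rests on assumptions the text does not spell out: we only consider textually consecutive members of the segment, and "another path" is read as an undirected path through relations between members of the same segment. The function name and data encoding are hypothetical.

```python
def add_continuations(relations, parts):
    """Return the Continuation triples needed to connect a complex segment.

    relations: list of (name, source, target) triples already annotated
    parts:     members of the complex segment, in textual order
    """
    # Undirected adjacency restricted to the members of the segment.
    neighbours = {p: set() for p in parts}
    for _, source, target in relations:
        if source in neighbours and target in neighbours:
            neighbours[source].add(target)
            neighbours[target].add(source)

    def connected(a, b):
        # Depth-first search for any path from a to b inside the segment.
        seen, stack = {a}, [a]
        while stack:
            node = stack.pop()
            if node == b:
                return True
            for other in neighbours[node] - seen:
                seen.add(other)
                stack.append(other)
        return False

    added = []
    for a, b in zip(parts, parts[1:]):
        if not connected(a, b):
            added.append(("Continuation", a, b))
            neighbours[a].add(b)
            neighbours[b].add(a)
    return added


# e2 and e3 are already linked, so only e1 needs a Continuation.
print(add_continuations([("Elaboration", "e2", "e3")], ["e1", "e2", "e3"]))
```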

Expanding Continuations Consider the following
discourse:

(4) [John, [who owns a chain of restaurants]$_{\pi_2}$ [and is a
director of a local charity organization,]$_{\pi_3}$ wanted to
sell his yacht.]$_{\pi_1}$ [He couldn't afford it anymore.]$_{\pi_4}$

Annotators sometimes produced the following
SDRT graph for the first three EDUs of this discourse:

(5) [Figure: $\pi_1$ linked to $\pi_2$ by E-Elab, and $\pi_2$ linked to $\pi_3$ by Continuation.]

In this case the only open node is $\pi_3$ due to
the coordinating relation Continuation. Nonetheless,
$\pi_4$ should be attached to $\pi_1$, without violating
the RFC. Indeed, SDRT's definition of
the Continuation relation enforces that if we have
$R(\pi_1, \pi_2)$ and Continuation($\pi_2$, $\pi_3$) then we actually
have the complex segment $[\pi_2, \pi_3]$ with
$R(\pi_1, [\pi_2, \pi_3])$. So there is in fact a missing complex
segment in (5). The proper SDRS graph of (4)
is:

[Figure: $\pi_1$ linked by E-Elab to a complex segment $\pi$ containing $\pi_2$ and $\pi_3$, which are linked by Continuation.]

which makes $\pi_1$ an available attachment site for
$\pi_4$. Such implied constituents have been added to
the SDRS graphs.
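The expansion step can be illustrated as a graph rewrite. The sketch below is one possible encoding, not the authors' code: it assumes each node starts at most one Continuation, that Continuation chains are acyclic, and it invents a bracketed string as the label of the new complex segment. The function name `expand_continuations` is hypothetical.

```python
def expand_continuations(relations):
    """Rewrite R(a, b) + Continuation(b, c) as R(a, [b, c]).

    relations: list of (name, source, target) triples
    Returns (rewritten_relations, members), where members maps each
    newly created complex-segment label to its parts in order.
    """
    # Assumes at most one outgoing Continuation per node and no cycles.
    next_in_chain = {s: t for name, s, t in relations if name == "Continuation"}
    continued = set(next_in_chain.values())  # nodes that are not chain heads

    members, rewritten = {}, []
    for name, source, target in relations:
        starts_chain = target in next_in_chain and target not in continued
        if name != "Continuation" and starts_chain:
            chain = [target]
            while chain[-1] in next_in_chain:
                chain.append(next_in_chain[chain[-1]])
            label = "[" + ",".join(chain) + "]"  # hypothetical segment label
            members[label] = chain
            rewritten.append((name, source, label))
        else:
            rewritten.append((name, source, target))
    return rewritten, members


# Graph (5): E-Elab(pi1, pi2) and Continuation(pi2, pi3).
graph5 = [("E-Elab", "pi1", "pi2"), ("Continuation", "pi2", "pi3")]
print(expand_continuations(graph5))
```

After the rewrite, E-Elab targets the complex segment rather than `pi2`, which is the configuration the text describes as making `pi1` available for `pi4`.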

Factoring Related to the operation of Expansion,
SDRT's definition of Continuation and
various subordinating relations also requires
that if we have $R(\alpha, [\pi_1, \pi_2, \ldots, \pi_n])$ where
$[\pi_1, \pi_2, \ldots, \pi_n]$ is a complex segment with
$\pi_1, \ldots, \pi_n$ linked by Continuation and $R$ is Elaboration,
Entity-Elaboration, Frame, Attribution, or
Commentary, then we also have $R(\alpha, \pi_i)$ for each
$i$. We added these relations when they were missing.
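Factoring amounts to distributing a relation over the members of a Continuation-linked segment. The following sketch assumes the same triple encoding as above and checks Continuation only between textually consecutive members; both choices, and the names `factor` and `FACTORING`, are ours.

```python
# Relations that distribute over a Continuation-linked segment,
# as listed in the paragraph above.
FACTORING = {"Elaboration", "Entity-Elaboration", "Frame",
             "Attribution", "Commentary"}


def factor(relations, members):
    """Return the missing R(a, pi_i) triples implied by factoring.

    relations: list of (name, source, target) triples
    members:   dict mapping a complex-segment label to its ordered parts
    """
    existing = set(relations)
    added = []
    for name, source, target in relations:
        parts = members.get(target)
        if name not in FACTORING or not parts:
            continue
        # Require the parts to be chained by Continuation.
        chained = all(("Continuation", a, b) in existing
                      for a, b in zip(parts, parts[1:]))
        if not chained:
            continue
        for part in parts:
            triple = (name, source, part)
            if triple not in existing:
                existing.add(triple)
                added.append(triple)
    return added


relations = [("Entity-Elaboration", "pi1", "pi"),
             ("Continuation", "pi2", "pi3")]
print(factor(relations, {"pi": ["pi2", "pi3"]}))
```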

4.2 Results 

With the operations just described, we added sev- 
eral inferred relations to the graph. We then cal- 
culated statistics concerning the percentage of at- 
tachments for which the RFC is respected using 
the following formula: 

$$\mathit{RFC}_{EDU} = \frac{\#\,\text{EDUs attached to the RF}}{\#\,\text{EDUs in total}}$$

As we explained, an EDU can be attached to an
SDRT graph directly by itself or indirectly as part
of a bigger complex segment. In order to calculate
the numerator we determine first whether an
EDU directly attaches to the graph's RF, and if that 
fails we determine whether it is part of a larger 
complex segment which is attached to the graph's 
RF. The results obtained are shown in the first two 
columns of table 2. The RFC is respected by at 
least some attachment decision 95% of the time — 
i.e., 95% of the EDUs get attached to another node 
that is found on the RF. The breakdown across our 
annotators is given in table 2. 

SDRT allows for multiple attachments of an 
EDU to various nodes in an SDRS; e.g. while an 
EDU may be attached via one relation to a node 
on the RF, it may be attached to another node off 
the RF. To take account of all the attachments for a 
given EDU, we need another way of measuring the 



percentage of attachments that respects the RFC. 
So we counted the ways each EDU is related to a 
node in the SDRS for the previous text and then 
divided the number of attachment decisions that 
respect the RFC by the total number of attachment 
decisions — i.e. : 

$$\mathit{RFC}_{r} = \frac{\#\,\text{RF attachment decisions}}{\#\,\text{total attachment decisions}}$$
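The difference between the two measures is easiest to see on a small worked case. The sketch below assumes each EDU comes with a list of booleans, one per attachment decision, that is true when the decision lands on the RF; this encoding and the name `rfc_scores` are hypothetical.

```python
def rfc_scores(decisions):
    """Compute (RFC_EDU, RFC_r) from per-EDU attachment decisions.

    decisions: dict mapping an EDU label to a non-empty list of booleans,
               True when that attachment decision targets a node on the RF.
    """
    # RFC_EDU: an EDU counts once if at least one of its decisions is on the RF.
    edus_on_rf = sum(1 for flags in decisions.values() if any(flags))
    # RFC_r: every attachment decision counts separately.
    rf_decisions = sum(sum(flags) for flags in decisions.values())
    total_decisions = sum(len(flags) for flags in decisions.values())
    return edus_on_rf / len(decisions), rf_decisions / total_decisions


# e3 has one attachment on the RF and one off it; e4 is attached off the RF.
toy = {"e2": [True], "e3": [True, False], "e4": [False]}
rfc_edu, rfc_r = rfc_scores(toy)
print(rfc_edu, rfc_r)
```

In this toy case two of three EDUs have some attachment on the RF, while only two of four attachment decisions do, which is why the second measure is never higher than the first.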



annotator                 RFC_EDU   RFC_r
undergrad 1               98.57%    91.28%
undergrad 2               98.12%    94.39%
undergrad 3               91.93%    89.17%
grad 1                    94.38%    86.54%
grad 2                    92.68%    83.57%
Mean for all annotators   95.24%    88.91%
Mean for 3 undergrad      96.17%    91.71%

Table 2: The % with which each annotator has respected
SDRT's RFC using the EDU and attachment
decision measures.



The third column of table 2 shows that having 
a stable annotation manual and GLOZZ improved 
the results across our two annotator populations, 
even though the annotation manual did not say 
anything about RFC or about the structure of the 
discourse graphs. Moreover, the distribution of vi- 
olations of the RFC follows a power law and only 
4.56% of the documents contained more than 5 vi- 
olations. This is strong evidence that there is little 
propagation of violations. 

5 Analysis of Presumed Violations 

Although 95% of EDUs attach to nodes on the 
RF of an SDRT graph, 5% of EDUs don't. SDRT 
experts performed a qualitative analysis of some 
of these presumed violations. In many cases, the 
experts judged that the presumed violations were 
due to click-errors: sometimes the annotators simply
clicked on something that did not translate into
a segment. Sometimes, the experts judged that the 
annotators picked the wrong segment to attach a 
new segment or the wrong type of relation during 
the construction of the SDRT graph. For example, 
in the graph that follows the relation between seg- 
ments 74 and 75 is not a Comment but an Entity- 
Elaboration. 



As expected, there were also "structural" er- 
rors, arising from a lack or a misuse of complex 
segments. Here is a typical example (translated 
from the original French): 

[Around her,]_74 [we should mention Joseph
Racaille]_75 [responsible for the magnificent
arrangements,]_76 [Christophe Dupouy]_77 [regular
associate of Jean-Louis Murat responsible
for mixing,]_78 [without forgetting her two
guardian angels:]_79 [her agent Olivier Gluzman]_80
[who signed after a love at first
sight,]_81 [and her husband Mokhtar]_82 [who
has taken care of the family]_83

Here is the annotated structure up to EDU 78:

[Figure: annotated graph over EDUs 74 to 78 (LAST), with arcs labelled Comment, E-elab and Continuation.]

Note that the attachment of 77 to 75 is non-local
and non-adjacent. The annotator then attaches
EDU 79 to 75 which is blocked from the RF due to
the Continuation coordinating relation. By not
having created a complex segment for the enumeration
that includes EDUs 75 to 78, the annotator
had no option but to violate the RFC. Here is the
proper SDRT graph for segments 74 to 79 (where
the attachment of 79 to 74 is also both non-local
and non-adjacent):

[Figure: proper SDRT graph for segments 74 to 79, including a complex segment $\pi$; the one arc label that survives extraction is Elab.]

In this case, before the introduction of EDU 79,
EDU 78 is LAST and by consequence 77, $\pi$ and 74
are on the RF. Attaching 79 to 74 is thus legitimate.

We also found more interesting examples of 
right frontier violations. One annotator produced 
a graph for a story which is about the attacks of 
9/11/2001 and is too long to quote here. A sim- 
plified graph of the first part of the story is shown 
below. EDU 4 elaborates on the main event of the 
story but it is not on the RF for 19. However, 19 
is the first recurrence of the complex definite description
le 11 septembre 2001 since the title and
the term's definition in EDU 4. 




Result 



^1-13^ 



„ ^,[4 4-16] 

KesuM I „ 

\ if Comment 

\l9 

This reuse of the full definite description could be 
considered a case of SDRT's discourse subordina- 
tion. 

6 RFC and distances of attachment 

Our empirical study vindicates SDRT's RFC, but 
it also has computational implications. Using the 
RFC dramatically diminishes the number of at- 
tachment possibilities and thus greatly reduces the 
search space for any incremental discourse parsing
algorithm.⁸ On average, the nodes that are open
on the RF at any given moment in our ANNODIS
data make up 16.43% of all the nodes in the graph.

Our data also allowed us to calculate the distance
of attachment sites from LAST, which could
be an important constraint on machine learning
algorithms for constructing discourse structures.
Given a pair of constituents $(\pi_i, \pi_j)$, distance is
calculated either textually (the number of intervening
EDUs between $\pi_i$ and $\pi_j$) or topologically
(the length of the shortest path between $\pi_i$ and $\pi_j$).
Topological distance, however, does not take into
account the fact that a textually further segment is
cognitively less salient. Moreover, this measure
can give the same distance to nodes that are textually
far from one another, due to long distance
pop-ups (Asher and Lascarides, 2003). A purely
textual distance, on the other hand, gives the same
distance to an EDU $\pi_j$ and a complex segment
$[\pi_i, \ldots, \pi_j]$ even if $\pi_i$ and $\pi_j$ are textually distant
(since both have the same span end). We used
a measure combining both. The distance scheme
that we used assigns to each EDU its textual distance
from LAST in the graph under consideration,
while a complex segment of rank 1 gets a distance
which is computed from the highest distance of
its constituent EDUs plus 1. For a constituent $\alpha$
of rank $n$ we have:

$$\mathit{Dist}(\alpha) = \max\{\mathit{dist}(x) : x \text{ in } \alpha\} + n$$
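The combined measure can be sketched as follows. Two points are assumptions on our part because the text leaves them open: the rank of a constituent is taken to be its nesting depth (an EDU has rank 0), and the textual distance of an EDU is taken to be the difference between its index and that of LAST, so LAST itself is at distance 0. The function and parameter names are hypothetical.

```python
def combined_distance(label, members, position, last_position):
    """Distance of a constituent from LAST: max EDU distance plus rank.

    label:         EDU or complex-segment label
    members:       dict mapping complex-segment labels to their parts
    position:      dict mapping EDU labels to their textual index
    last_position: textual index of LAST
    """
    def edus_and_rank(node):
        # An EDU has rank 0; a segment is one rank above its deepest part.
        if node not in members:
            return [node], 0
        leaves, rank = [], 0
        for part in members[node]:
            sub_leaves, sub_rank = edus_and_rank(part)
            leaves.extend(sub_leaves)
            rank = max(rank, sub_rank + 1)
        return leaves, rank

    leaves, rank = edus_and_rank(label)
    # Assumed textual distance: index difference from LAST.
    return max(last_position - position[x] for x in leaves) + rank


position = {"e1": 0, "e2": 1, "e3": 2, "e4": 3}
members = {"seg": ["e1", "e2"]}
print(combined_distance("e3", members, position, 3))
print(combined_distance("seg", members, position, 3))
```

Under these assumptions the segment `[e1, e2]` is further from LAST than either of its members, which is the behaviour the rank term is meant to capture.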



⁸An analogous approach for search space reduction is followed
by duVerle and Prendinger (2009) who use the "Principle
of Sequentiality" (Marcu, 2000), though they do not say
how much the search space is reduced.



The distribution of attachment follows a power 
law with 40% of attachments performed non- 
locally, that is on segments of distance 2 or more 
(figure 1). This implies that the distance between 
candidate attachment sites that are on the RF is an 
important feature for an ML algorithm. It is impor- 
tant to note at this point that following the baseline 
approach of always attaching on the LAST misses 
40% of attachments. Moreover, 20.38% of the
attachments in our annotations are non-local and
non-adjacent, so an RST parser using Marcu's (2000)
adjacency constraint, as do duVerle and Prendinger
(2009), would miss these.




[Plot omitted; x-axis: attachment distance, 2 to 20]

Figure 1: Distribution of attachment distance

7 Related Work

Several studies have shown that the RFC may be 
violated as an anaphoric constraint when there 
are other clues, content or linguistic features, that 
determine the antecedent. Poesio and di Eugenio
(2001), Holler and Irmen (2007), Asher (2008) and
Prevot and Vieu (2008), for example, show that
anaphors such as definite descriptions and com- 
plex demonstratives, which often provide enough 
content on their own to isolate their antecedents, 
or pronouns in languages like German which must 
obey gender agreement, might remain felicitous 
although the discourse relations between them and 
their antecedents might violate the RFC. Usually 
there are few linguistic clues that help find the 
appropriate antecedent to a discourse relation, in 
contrast to the anaphoric expressions mentioned 
above. Exceptions involve stylistic devices like 
direct quotation that license discourse subordination.
Thus, SDRT predicts that RFC violations for



discourse attachments should be much more rare 
than those for the resolution of anaphors that pro- 
vide linguistic clues about their antecedents. 

As regards other empirical validation of various
versions of the RFC for the attachment of
discourse constituents, Wolf and Gibson (2006)
show an RST-like RFC is not supported in their
corpus GraphBank. Our study concurs in that
some 20% of the attachments in our corpus cannot
be formulated in RST.⁹ On the other hand,
we note that because of the 2-dimensional nature
of SDRT graphs and because of the caveats introduced
by structural relations and discourse subordination,
the counterexamples from GraphBank
against, say, RST representations do not carry over
straightforwardly to SDRSs. In fact, once these
factors are taken into account, the RFC violations
in our corpus and in GraphBank are roughly
the same.

8 Conclusions 

We have shown that SDRT's RFC has strong empirical
support: the attachments of our 3 completely
naive annotators fully comply with RFC 91.7% of
the time and partially comply with it 96% of the
time. As a constraint on discourse parsing, SDRT's
RFC is, we have argued, both empirically and
computationally motivated. We have also shown
that non-local attachments occur about 40% of the
time, which implies that attaching directly on the
LAST will not yield good results. Further, many of
the non-local attachments do not respect RST's
adjacency constraint. We need SDRT's RFC to get the
right attachment points for our corpus. We believe
that empirical studies of the kind we have given
here are essential to finding robust and useful
features that will vastly improve discourse parsers.
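
To make the contrast between attaching only to LAST and attaching anywhere
on the right frontier concrete, the following minimal Python sketch computes
the set of open attachment points for a toy graph. It assumes a simplified
encoding in which each node lists the nodes that immediately dominate it
(through a subordinating relation or through membership in a complex
segment). The function name `right_frontier`, the `dominators` mapping and
the node labels are illustrative assumptions, not part of our annotation
setup.

```python
def right_frontier(last, dominators):
    """Return LAST followed by every node that dominates it.

    `dominators` maps a node to the nodes that immediately dominate it.
    This encoding is a simplifying assumption made for illustration.
    """
    frontier = []
    seen = set()
    stack = [last]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against revisiting a shared dominator
        seen.add(node)
        frontier.append(node)
        stack.extend(dominators.get(node, ()))
    return frontier


# Toy structure: pi1 is elaborated by a complex segment pi_c,
# which contains pi2 and pi3 linked by a coordinating relation.
dominators = {
    "pi2": ["pi_c"],
    "pi3": ["pi_c"],
    "pi_c": ["pi1"],
}

print(right_frontier("pi3", dominators))  # ['pi3', 'pi_c', 'pi1']
```

In this toy case a parser restricted to LAST could only attach to pi3,
whereas the right frontier also leaves pi_c and pi1 open; pi2 is closed
off in both settings.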



^One other study we are aware of is Sassen and Kühnlein
(2005), who show that in chat conversations, the RFC
does not always hold unconditionally. Since this genre of
discourse is not always coherent, it is expected that the RFC
will not always hold here.



References 

Asher, N. and A. Lascarides. 2003. Logics of Con- 
versation. Studies in Natural Language Processing. 
Cambridge University Press, Cambridge, UK. 

Asher, N. and L. Vieu. 2005. Subordinating and co- 
ordinating discourse relations. Lingua, 115(4):591- 
610. 

Asher, N. 1993. Reference to Abstract Objects in Dis- 
course. Kluwer Academic Publishers. 

Asher, N. 2007. A large view of semantic content. 
Pragmatics and Cognition, 15(1):17-39.

Asher, N. 2008. Troubles on the right frontier.
In Benz, A. and P. Kühnlein, editors, Constraints
in Discourse, Pragmatics and Beyond New Series,
chapter 2, pages 29-52. John Benjamins Publishing
Company.

Baldridge, J., N. Asher, and J. Hunter. 2007.
Annotation for and robust parsing of discourse
structure on unrestricted texts. Zeitschrift für
Sprachwissenschaft, 26:213-239.

Danlos, L. 2008. Strong generative capacity of RST,
SDRT and discourse dependency DAGs. In Benz, A.
and P. Kühnlein, editors, Constraints in Discourse,
Pragmatics and Beyond New Series, pages 69-95.
John Benjamins Publishing Company.

duVerle, D. and H. Prendinger. 2009. A novel dis- 
course parser based on support vector machine clas- 
sification. In Proceedings of ACL, pages 665-673, 
Suntec, Singapore, August. 

Hardt, D. and M. Romero. 2004. Ellipsis and 
the structure of discourse. Journal of Semantics, 
21:375-414, November. 

Hardt, D., N. Asher, and J. Busquets. 2001. Discourse 
parallelism, scope and ellipsis. Journal of Seman- 
tics, 18:1-16. 

Holler, A. and L. Irmen. 2007. Empirically assessing 
effects of the right frontier constraint. In Anaphora: 
Analysis, Algorithms and Applications, pages 15- 
27. Springer, Berlin/Heidelberg. 

Lascarides, A. and N. Asher. 1993. Temporal
interpretation, discourse relations and commonsense
entailment. Linguistics and Philosophy, 16(5):437-493.

Lee, A., R. Prasad, A. Joshi, and B. Webber. 2008.
Departures from tree structures in discourse: Shared
arguments in the Penn Discourse Treebank. In
Constraints in Discourse (CID '08), pages 61-68.



Mann, W. and S. Thompson. 1987. Rhetorical structure
theory: A framework for the analysis of texts.
Technical Report ISI/RS-87-185, Information Sciences
Institute, Marina del Rey, California.

Mann, W. and S. Thompson. 1988. Rhetorical struc- 
ture theory: Towards a functional theory of text or- 
ganization. Text, 8(3):243-281. 

Marcu, D. 2000. The Theory and Practice of Dis- 
course Parsing and Summarization. The MIT Press. 

Muller, P. and A. Raymonet. 2005. Using inference
for evaluating models of temporal discourse. In
12th International Symposium on Temporal
Representation and Reasoning, pages 11-19. IEEE
Computer Society Press.

Poesio, M. and B. di Eugenio. 2001. Discourse struc- 
ture and anaphoric accessibility. In Proc. of the 
ESSLLI Workshop on Discourse Structure and In- 
formation Structure, August. 

Polanyi, L. 1985. A theory of discourse structure and
discourse coherence. In Kroeber, P. D., W. H. Eilfort,
and K. L. Peterson, editors, Papers from the
General Session at the 21st Regional Meeting of the
Chicago Linguistics Society.

Polanyi, L. 1988. A formal model of the structure of 
discourse. Journal of Pragmatics, 12:601-638. 

Prevot, L. and L. Vieu. 2008. The moving right
frontier. In Benz, A. and P. Kühnlein, editors,
Constraints in Discourse, Pragmatics and Beyond New
Series, chapter 3, pages 53-66. John Benjamins
Publishing Company.

Sassen, C. and P. Kühnlein. 2005. The right frontier
constraint as conditional. In Computational
Linguistics and Intelligent Text Processing, Lecture
Notes in Computer Science (LNCS), pages 222-225.

Setzer, A., R. Gaizauskas, and M. Hepple. 2003. 
Using semantic inferences for temporal annotation 
comparison. In Proceedings of the Fourth Interna- 
tional Workshop on Inference in Computational Se- 
mantics (ICoS-4). 

Webber, B. 1988. Discourse deixis and discourse
processing. Technical Report MS-CIS-88-75,
University of Pennsylvania, Department of Computer
and Information Science, September.

Wolf, F. and E. Gibson. 2006. Coherence in Natural
Language: Data Structures and Applications. The
MIT Press.



